Abstract Software health management (SWHM) techniques
complement the rigorous verification and validation
processes that are applied to safety-critical systems
prior to their deployment. These techniques are used
to monitor deployed software in its execution environ- 
ment, serving as the last line of defense against the 
effects of a critical fault. SWHM monitors use informa- 
tion from the specification and implementation of the 
monitored software to detect violations, predict pos- 
sible failures, and help the system recover from faults. 
Changes to the monitored software, such as adding new 
functionality or fixing defects, therefore, have the po- 
tential to impact the correctness of both the monitored 
software and the SWHM monitor. In this work, we 
describe how the results of a software change impact 
analysis technique, Directed Incremental Symbolic Ex- 
ecution (DiSE), can be applied to monitored software 
to identify the potential impact of the changes on the 
SWHM monitor software. The results of DiSE can then 
be used by other analysis techniques, e.g., testing, de- 
bugging, to help preserve and improve the integrity of 
the SWHM monitor as the monitored software evolves. 
1 Introduction

The size and complexity of software in safety-critical 
systems have increased considerably over time, due in 
part to the addition of richer feature sets, more au- 
tomation, and continued efforts to improve the safety 
and reliability of these systems. As a result, the task of 
verifying and validating these larger and more complex 
software systems has become much more challenging 
and time consuming, requiring new techniques to help 
ensure the reliability 1 and correctness of safety-critical 
systems. 

Software health management (SWHM) techniques 
have recently been developed to complement the various
verification and validation processes applied to safety-critical
systems prior to their deployment [12,29,37-39].
Monitors are at the core of these techniques. SWHM 
monitors observe and analyze the system in its execu- 
tion environment during runtime to detect and respond 
to violations, and to predict possible failures in the near 
future. These monitors are often implemented as soft- 
ware components, and as a result also require some level 
of analysis to ensure their correctness and reliability. 
The analysis of monitors is important because moni- 
tors often serve as a last line of defense against the po- 
tentially catastrophic effects of faults in safety-critical 
systems. 

Software changes are inevitable in most deployed 
systems - successful software systems evolve as require- 
ments change and defects are fixed. Even in safety- 
critical systems, software is rarely exempt from change 
after deployment. For example, the discovery of a crit- 
ical defect (bug) in the system may require an update 
to the operational software to avoid a system failure. 

1 We use the term reliability to mean ‘continuity of correct
service’ as specified in [4].

Systems also undergo change when new functionality
is added. Changing operational software, however, is 
known to be risky. Even small changes to the code 
can have a major impact on how the software executes. 
Moreover, bug fixes may not always fix the defect and 
can potentially introduce new defects. For example, a 
recent study showed that 14.8%-24.4% of the sampled 
fixes for post-release bugs in several large, mature oper- 
ating systems were incorrect and had a negative impact 
on the end users [48]. 

SWHM monitors use information from the specifi- 
cation and implementation of the monitored software 
to determine which values to analyze and to determine 
tolerance ranges for those values. This tight coupling of 
monitor and monitored software means that the impact 
of a change to the monitored software has the potential 
to also impact the SWHM monitor and its correctness 
in the context of the change. Various change impact 
analysis techniques have been developed to identify the 
differences between two program versions in order to 
guide testing and verification efforts on the changed 
software [1,6,14,17,20,28,36,45]. The objective of these 
techniques is to reduce the time and cost of testing and 
verification of the changed system by guiding the anal- 
ysis towards the parts of the system impacted by the 
changes. However, to the best of our knowledge, change 
impact analysis techniques have not been explored in 
the context of how information about changes to mon- 
itored software can be used to help identify the impact 
of the changes on the SWHM monitor. 

In previous work [28,34], we presented Directed Incremental
Symbolic Execution (DiSE), a change impact
analysis technique for computing the effects of program 
changes in terms of program execution behaviors. The 
DiSE technique can be used throughout the software 
development lifecycle to help guide software engineering
tasks such as testing, debugging, and regression
verification whenever software changes are necessary.
In this work, we explore how the results of DiSE, ap- 
plied to the monitored software, can be further lever- 
aged in order to maintain and improve the integrity of 
SWHM monitor software. The main contributions of 
our work include: 

— We describe how the results of the DiSE change im- 
pact analysis on monitored software can be used to 
identify the impact of the changes on the SWHM 
monitor software. 

— We apply our technique to a system with a SWHM 
monitor modeled as a Bayesian network and evalu- 
ate its cost and effectiveness with the following two 
research questions: 

RQ1: How does the cost of applying DiSE to the
monitored software compare with using traditional
symbolic execution to compute the impact
of the changes?

RQ2: How does the number of impacted path conditions
generated by DiSE compare with the number
of path conditions generated by traditional symbolic
execution?

— We describe how the results of DiSE on the mon- 
itored software can be used to help validate and 
update the SWHM monitor software to preserve 
and improve its integrity as the monitored software 
evolves. 

2 Software Health Management 

The size and complexity of software in safety-critical
systems are increasing rapidly as more components are
added to facilitate automation. The number of sensors 
and actuators on aircraft has steadily increased over 
time, as has the software to control and monitor these 
devices. More sophisticated algorithms for the autopi- 
lot, navigation, collision detection and avoidance, and 
other on-board systems, have also contributed to the 
increase in software. In recent times, we have also seen 
a shift of responsibilities from pilots to automated sys- 
tems for a large number of tasks. In the next generation 
of aircraft, we expect to see continued growth in the size 
and complexity of the software to enable even more au- 
tomation in these systems. 

Rigorous design, verification, and certification pro- 
cesses have been established to check the correctness 
of safety-critical software before it is deployed. How- 
ever, the size and complexity of the systems prohibit 
exhaustive testing and verification. Moreover, it may 
not be possible to anticipate or re-create particular en- 
vironmental conditions for verification purposes, and 
therefore parts of the system may not be tested prior 
to deployment. In order to address these limitations, 
SWHM techniques have been proposed to monitor the 
software after it is deployed. 

Building on decades of research in systems and ve- 
hicle management, together with research in software 
runtime verification, SWHM techniques [12,29,38,39] 
have been developed to support monitoring of software 
as it executes and interacts with the hardware (sensors 
and actuators) after deployment. SWHM monitors per- 
form fault detection, isolation and recovery. They can 
also monitor for assumption violations and other con- 
ditions that are useful in post-flight analysis. Software 
health management software often serves as a guardian 
to the system during its operational phase, ensuring its 
correct and safe operation. 

Software health managers have been developed to 
monitor the values of sensors and variables in software, 
as well as updates to these variables, the health of the
sensors and variables, and also to compute the likeli- 
hood of failure using Bayesian networks [38,39]. Soft- 
ware health managers have also been developed to mon- 
itor the health of components in the system; detect- 
ing anomalies, identifying and isolating the fault causes 
of the anomalies (when feasible), prognosticating fu- 
ture faults, and when possible, mitigating the effects of 
faults [12]. Software monitors have also been generated 
and used in embedded systems with hard real-time con- 
straints, to sample variables in the monitored software 
and implement fault tolerant algorithms to determine 
the health of the monitored software [29].

SWHM systems have been implemented at both the 
model-level [38,39] and at the code-level [12,29]. Model-level
systems are tested using model simulations on a
wide range of sensor inputs and various values for inter- 
nal parameters. The analysis of the model is intended 
to provide confidence in the correctness of the expected 
behavior of the model. Code-level SWHM systems often 
have the same V&V requirements as the monitored soft- 
ware, e.g., to achieve a particular level of code coverage 
during testing, and they are expected to satisfy similar 
verification conditions as the monitored software. 

In addition to functional properties, SWHM mon- 
itor software is typically expected to preserve certain 
non-functional properties, such as non-interference and 
timing properties. For example, SWHM monitors must 
not interfere with the monitored software, e.g., change 
the behavior of the monitored software (unless the mon- 
itored property has violated a contract). They must also 
avoid corrupting any data or causing any crashes in 
the system. SWHM monitors also must not miss vio- 
lations or alarms, and should minimize the number of 
false alarms. 

The verification and validation of SWHM monitor 
software, similar to any other software, is an ongoing 
process that is necessary throughout the development 
lifecycle to ensure changes to the system have their in- 
tended effects and that no unintended behaviors were 
introduced by the changes. In the case of SWHM moni- 
tors, verification and validation of the monitor may also 
be required when the monitored software is changed, 
due to the tight coupling between the monitor and the 
monitored software, e.g., through the values that are 
monitored. Techniques that identify what is changed 
and the impact of the changes on the SWHM moni- 
tor play an important role in maintaining the health 
(correctness) of SWHM monitor software. Before pre- 
senting our technique for maintaining the health of soft- 
ware monitors, we first provide background on software 
change impact analysis techniques and discuss some of
the challenges associated with computing precise software
change impact information.

3 Change Impact Analysis 

Software change impact analysis techniques [3] are used 
to detect the parts of a program affected by the changes 
made to the code. Given the evolutionary nature of 
software development, these techniques play a critical 
role in software development and maintenance, where 
even a one line fix can potentially have unintended and 
even disastrous consequences. The results computed by 
change impact analysis techniques have been widely 
used to support software maintenance tasks, such as re- 
gression testing [21,23,33,41], regression verification [5, 
40], studying changes in large code bases [32], and for 
automated generation of program documentation [7]. 

Given two closely related program versions, change 
impact analysis techniques are performed in two steps:
1) compute the differences between program versions,
i.e., the change set, and 2) using the change set as input,
compute the impact of the differences, i.e., the
impact set. The change set can be computed based on a
variety of program representations. Computing changes 
based on source code is commonly used in practice because
it is efficient and automated. The differencing
techniques based on textual differences, however, are 
often sensitive to formatting and syntactic changes that 
may not affect the way the program executes. 
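The sensitivity of textual differencing to formatting can be illustrated with a small Java sketch. It compares two versions of a statement first as raw text and then as token sequences. The class and method names are hypothetical, and the regular-expression tokenizer is a deliberate simplification (it does not handle string literals or comments); it is not the differencing used by DiSE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: textual versus token-level comparison of two versions.
final class TokenDiffSketch {
    // Split source text into word tokens (identifiers, numbers) and
    // single non-whitespace characters (operators, punctuation).
    static List<String> tokens(String source) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+|\\S").matcher(source);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // True when the two versions differ at most in whitespace and layout.
    static boolean sameTokens(String a, String b) {
        return tokens(a).equals(tokens(b));
    }

    public static void main(String[] args) {
        String original = "if (x > 0) return x + 1;";
        String reformatted = "if(x>0)\n    return x+1;";
        String changed = "if (x >= 0) return x + 1;";
        System.out.println(original.equals(reformatted));      // raw text differs
        System.out.println(sameTokens(original, reformatted)); // tokens agree
        System.out.println(sameTokens(original, changed));     // a real change
    }
}
```

A purely textual diff would report the reformatted version as changed, whereas the token comparison reports a difference only for the version in which the comparison operator was modified.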

Differences computed based on some graphical rep- 
resentation of the code, e.g., Abstract Syntax Tree (AST), 
Program Dependence Graph (PDG), or Control-Flow Graph
(CFG), are, in general, more precise than differences
computed on the source code as is. The graphical repre- 
sentations of code encode additional information about 
the program, e.g., control and data dependences, that 
is useful for computing more precisely the impact of 
source code changes. Any technique, however, that com- 
putes change impact based strictly on differences in the 
source code structure will have limited capabilities to 
reason about the impact of the changes on the execution 
of the code. This is especially true when dynamically 
allocated data and complex control structures such as 
loops and recursion are present in the program. 

Once the change set is computed, change impact 
analysis techniques compute which parts of the program 
may be impacted by the changes. The impact of changes 
can be computed using information from a static repre- 
sentation of the program or using dynamic information 
obtained through program execution. Techniques such 
as [2] analyze a static representation of the program 
to compute the impact set in terms of program state- 
ments that may be directly or indirectly impacted by 
the changes to the source code. Godefroid et al. statically
check whether previously-computed symbolic test
summaries are still valid, i.e., not impacted by code
changes, to support compositional dynamic test generation
[13].

Techniques which use dynamic information [20,33] 
have the potential to compute more precise impact sets 
because they are based on actual program execution 
paths. Dynamic analysis is typically driven using a set 
of test cases, so the impact sets will be computed with 
respect to the specific execution paths explored, which 
may be a small subset of all feasible execution paths. 
Other recent work has explored the use of symbolic ex- 
ecution results to compute precise change impact char- 
acterizations by systematically exploring the program 
execution space [15,31,27,42,45]. The results of these 
change impact analysis techniques have been used to 
support a range of software evolution tasks, including 
test case selection and test suite augmentation; how- 
ever, scalability is an issue for these techniques. 

The change impact analysis used in this work, Di- 
rected Incremental Symbolic Execution (DiSE) [28,34], 
combines the efficiency of static program analysis tech- 
niques with the precision of dynamic analysis techniques 
to compute the impact of software changes in terms of 
program execution behaviors. This approach results in 
a more precise impact set than using static analysis 
alone, and also addresses the scalability issues associ- 
ated with symbolic execution by using the results of the 
static analysis to direct symbolic execution towards the 
parts of the program impacted by the changes. This ef- 
fectively ‘prunes’ the program behaviors that are not 
impacted by the changes to the code. Because the re- 
sults of DiSE are computed in terms of program exe- 
cution paths, they can be used to support a range of 
software maintenance tasks, including regression test- 
ing, debugging and regression verification. 

In the following sections we describe the DiSE al- 
gorithm and illustrate how the impact set for a small 
working example is computed. We then describe a novel 
application of DiSE results computed on the monitored 
software to help maintain and improve the integrity of 
the SWHM monitor software. 

4 Directed Incremental Symbolic Execution 

Directed Incremental Symbolic Execution (DiSE) [28, 
34] is a program analysis technique for computing the 
impact of changes to software. The output of DiSE is a 
characterization of the effects of code changes on program
execution behaviors. The effects are characterized
in terms of the inputs to the program and the effects
of execution on variables in the program. In previous
work [5,28,34] we describe how DiSE is a general
change impact analysis and how the change impact re- 
sults computed by DiSE can be used for various soft- 
ware maintenance and evolution tasks, including test 
case selection and prioritization for regression testing, 
debugging, and regression verification. In this work we 
describe a novel application of DiSE results to soft- 
ware health management and discuss how the results 
computed by DiSE can be used to help ensure the cor- 
rect operation of a SWHM monitor when the monitored 
software is changed. 

The novelty of the DiSE change impact analysis 
is to leverage the efficiencies of static analysis tech- 
niques for computing the impact of program changes to 
guide a more precise analysis technique, symbolic exe- 
cution, to explore and characterize program execution 
paths that may be impacted by the changes. Our work 
was inspired by Regression Model Checking (RMC), 
a technique which uses the results of a static analysis 
to explore the ‘dangerous’ elements in the state space 
whose behavior may be impacted by the changes to 
the code [46]. DiSE differs from RMC in that DiSE is 
based on incremental symbolic execution, rather than 
model checking, and DiSE does not require analysis re- 
sults to be carried forward as the software evolves - 
only the source code for two related program versions 
is required. 

An overview of the DiSE analysis is shown in Fig. 1. 
The inputs to DiSE are two related program versions: 
the original source (S) and the modified source (S'),
and a source-level syntactic diff between S and S'. The
source-level diff provides information about the source 
code lines that are changed, added, and removed be- 
tween S and S'. A control flow graph (CFG) for S', 
shown in Fig. 1, is constructed from the source of S' 
and used to guide symbolic execution towards impacted 
program behaviors. 

There are two phases of analysis in DiSE. Phase I 
estimates the impact of the differences on the source 
code of S'. Phase II uses the information generated in
phase I to compute, with better precision, the impact of
the changes on the program execution behaviors. The
two phases are shown in Fig. 1. The output of DiSE
is the set of program behaviors in S' impacted by the
differences between S and S'. The set of Impacted Program
Behaviors can then be used to identify the impact
of the changes on the software health manager (SWHM
Monitor) responsible for monitoring the software, as
demonstrated in Fig. 1. 

In the remainder of this section we describe each
phase of DiSE. In Section 5 we describe how the results
of DiSE can be used to help identify the impact of the
changes on the software health manager.

Fig. 1 Overview of DiSE. The inputs to DiSE are two syntactically
similar program versions, S and S', and a source-level
diff between S and S'. The output of DiSE is a set of
impacted program behaviors that can be used to manage the
health of a SWHM monitor.


4.1 Source Impact Analysis 

In the first phase of DiSE, information about the added, 
removed and changed lines of code is used by data- 
and control-flow analyses to mark additional lines of 
code that may be impacted by the differences between 
S and S'. The static analysis computes the impact of 
the changes by analyzing a control-flow graph repre- 
sentation of the source code. A data-flow analysis is 
used to identify where variables are defined and used, 
but the concrete (actual) values of variables in the pro- 
gram that are possible within its environment are not 
computed. This under-approximation of the execution 
environment makes the analysis efficient and scalable to 
larger programs. Furthermore, the analyses in phase I
are conservative, i.e., every source line of code that may
be impacted by the change will be marked as impacted.
This ensures that the source impact analysis does not 
miss marking any instructions that are impacted by the 
changes, although it may also mark code as impacted, 
when in fact, it is not. 


Consider the annotated control flow graph (CFG) 
for the modified source, S', in Fig. 1. Each node in
the CFG (n_0 ... n_9 and n_end) represents a single line
of source code. Nodes n_0 and n_end are the respective
entry and exit nodes to the program. All nodes in the
CFG are reachable from n_0, and n_end is reachable from
all nodes in the CFG. The edges between the nodes
represent the flow of control between the different program
statements during execution. The shaded node,
n_2, represents the changed source line of code based on
the results of the source-level diff comparing S and S'.

During the source impact analysis phase of DiSE, 
impacted nodes are computed by starting with the set 
of changed nodes and then using data- and control-flow 
information to compute the impacted nodes (program 
statements). In Fig. 1, the nodes annotated with V (n_0,
n_2, n_4, n_7, n_8, n_9, n_10, and n_end) represent the source
lines impacted by the change. The output of this analysis
is the set of source lines of code in S' impacted by
the differences between S and S'. This information is
used to direct the more precise analysis, symbolic ex- 
ecution, in phase II of DiSE. In the remainder of this 
section we provide a high-level description of the DiSE 
algorithm. The reader is referred to [28,34] for a de- 
tailed description of the analyses implemented in DiSE. 

4.1.1 Estimating Impact Based on Control Flow

Conditional branch statements, e.g., if and while, com- 
pare the values of the specified variables and constants 
and then follow the appropriate branch based on the 
results of the comparison. Explicit changes to a condi- 
tional statement, i.e., changes to the comparison oper- 
ator or the operands, may impact which code block is 
executed as a result of the change. For example, con- 
sider the following code fragment: 

int condTest(int x){
1:  if (x > 0)
2:    return x + 1;
3:  else
4:    return x - 1;
5: }

This code returns x + 1 when the input value of x is 
greater than 0, and when the input value of x is not 
greater than 0, the code returns x - 1. The execution behavior
of this code can be summarized in various ways.
An example summarization based on the program in- 
puts and outputs is: “for any input value of x , the pro- 
gram never returns 0 or 1.” Suppose the comparison 
operator at line 1 is changed to ‘>=’. As a result of the 
change to the code, the execution behavior of the program
is impacted and can now be summarized as: “for
any input value of x, the program never returns 0 or
-1.” This example demonstrates how a change to the
comparison operation of a conditional branch statement 
impacts the behavior of the program. 
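The two summarizations can be checked mechanically on a bounded range of inputs. The Java sketch below encodes both versions of condTest and lists the values in a small window that are never returned. The class name, the helper neverReturned, and the bound are illustrative choices rather than part of DiSE, and a bounded enumeration is only evidence for the summary, not a proof.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration of the behavioral change caused by '>' becoming '>='.
final class CondTestChange {
    static int original(int x) {
        if (x > 0) return x + 1;
        else return x - 1;
    }

    static int modified(int x) {
        if (x >= 0) return x + 1;
        else return x - 1;
    }

    // Values in [-bound, bound] that f does not return for any input
    // in [-bound - 1, bound + 1] (enough inputs, since outputs are x + 1 or x - 1).
    static List<Integer> neverReturned(IntUnaryOperator f, int bound) {
        List<Integer> missing = new ArrayList<>();
        for (int out = -bound; out <= bound; out++) {
            boolean produced = false;
            for (int x = -bound - 1; x <= bound + 1; x++) {
                if (f.applyAsInt(x) == out) {
                    produced = true;
                    break;
                }
            }
            if (!produced) {
                missing.add(out);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(neverReturned(CondTestChange::original, 5)); // [0, 1]
        System.out.println(neverReturned(CondTestChange::modified, 5)); // [-1, 0]
    }
}
```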

Recall that the source impact analysis does not con- 
sider the execution environment of the program, i.e., 
the possible values of x during runtime. Instead, the 
analysis will conservatively estimate that the change in 
the conditional branch statement will impact all statements
whose execution is dependent on the result of
the comparison operation. For the condTest example 
shown above, when the comparison operator is changed 
at line 1, the analysis estimates both line 2 and line 4 to 
be impacted. The analysis considers all changes to con- 
ditional branch statements, including the addition and 
deletion of conditional branch statements, when esti- 
mating the impact of the changes on the control flow of 
the program. 

4.1.2 Estimating Impact Based on Data Flow

Changes to an assignment statement in a program may 
impact the value of program variables. And, as a result, 
other assignment statements, return statements, and 
comparison operations that execute after the changed 
assignment statement and use (read) the changed value 
may also be impacted by the change. Consider the fol- 
lowing code fragment: 

int dataTest(int x){
1:  x = x + 1;
2:  int tmp = 0;
3:  if (x > 0)
4:    tmp = x + 1;
5:  else
6:    tmp = x - 1;
7:  return tmp;
8: }

In this example, suppose the assignment to x at line
1 is changed to x = x - 1. Using a data-flow analysis
(which also takes into account the control flow of
the program), the analysis would then identify the im- 
pacted statements as: (a) the assignment statements at 
lines 4 and 6 where the value of x is used to compute 
the value assigned to the tmp variable, (b) the condi- 
tional branch statement at line 3 where the value of x 
is read and compared with 0, and (c) the return state- 
ment that reads the value of tmp. Note that the return 
statement is marked as impacted because a transitive 
closure is computed for the data-flow analysis. 
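The transitive marking described above can be sketched as a small fixed-point computation in Java. This is a simplified stand-in for the analyses in DiSE, under stated assumptions: each statement lists the variables it defines and uses and the branch line that controls it, "executes after" is approximated by a larger line number (so loops are not handled), and definitions are never killed. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified source impact analysis over a straight-line listing.
final class SourceImpactSketch {
    static final class Stmt {
        final int line;
        final int controlledBy;   // line of the controlling branch, 0 if none
        final List<String> defs;  // variables written
        final List<String> uses;  // variables read

        Stmt(int line, int controlledBy, List<String> defs, List<String> uses) {
            this.line = line;
            this.controlledBy = controlledBy;
            this.defs = defs;
            this.uses = uses;
        }
    }

    // Start from the changed lines and mark statements until nothing new is added.
    static Set<Integer> impacted(List<Stmt> program, Set<Integer> changed) {
        Set<Integer> marked = new TreeSet<>(changed);
        boolean grew = true;
        while (grew) {
            grew = false;
            for (Stmt s : program) {
                if (marked.contains(s.line)) {
                    continue;
                }
                // Control flow: the statement depends on an impacted branch.
                boolean hit = marked.contains(s.controlledBy);
                // Data flow: the statement reads a variable written by an
                // earlier impacted statement.
                for (Stmt d : program) {
                    if (d.line < s.line && marked.contains(d.line)
                            && !Collections.disjoint(d.defs, s.uses)) {
                        hit = true;
                    }
                }
                if (hit) {
                    marked.add(s.line);
                    grew = true;
                }
            }
        }
        return marked;
    }

    // The dataTest listing; line 5 ('else') carries no statement of its own.
    static List<Stmt> dataTest() {
        List<String> none = Collections.emptyList();
        List<String> x = Collections.singletonList("x");
        List<String> tmp = Collections.singletonList("tmp");
        return Arrays.asList(
                new Stmt(1, 0, x, x),
                new Stmt(2, 0, tmp, none),
                new Stmt(3, 0, none, x),
                new Stmt(4, 3, tmp, x),
                new Stmt(6, 3, tmp, x),
                new Stmt(7, 0, none, tmp));
    }

    public static void main(String[] args) {
        // Line 1 is the changed assignment.
        System.out.println(impacted(dataTest(), Collections.singleton(1))); // [1, 3, 4, 6, 7]
    }
}
```

Line 2 stays unmarked because it reads no impacted value and is not controlled by an impacted branch, while line 7 is reached only through the transitive step from lines 4 and 6.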

The source impact analysis performed in phase I 
of DiSE is guaranteed to terminate. In the worst case, 
all of the source lines in the program are marked as
impacted. Such a case would generally be observed for
a program that has a very high coupling between its 
components and variables. Another case is when the 
change is made to a part of the program that interacts 
with all of the other parts of the program. In general, we 
do not expect the small, incremental changes made to a 
system to impact the entire program. The complexity of 
the source impact analysis is polynomial in the number 
of source lines of code. 


4.2 Directed Symbolic Execution 

Before we present the details of the second phase of 
DiSE, we first provide a brief description of symbolic 
execution. 

4.2.1 Symbolic Execution

Symbolic execution [9, 19] is a non-standard approach 
to program execution that uses symbolic values in place 
of concrete (actual) values for program inputs. The out- 
put values are computed as expressions defined over 
constants and the symbolic input values, and using a 
specified set of operators. To illustrate symbolic execu- 
tion, we use the following code fragment: 

int y;

int testX(int x){
  if (x > 0)
    return y + 1;
  else
    return y - 1;
}

To perform symbolic execution on this code fragment, 
two symbolic variables are used: Y represents the sym- 
bolic value of the integer field y, and X is the symbolic 
integer value used to represent x, the integer argument 
to testX. During symbolic execution, a path condition 
is used to collect constraints on the program inputs 
that will result in execution of the current path. In this 
example, symbolic execution computes two path conditions:
(1) when X > 0, the value Y + 1 is returned, and
(2) when ¬(X > 0), the value Y - 1 is returned. The
program behavior summarization would be as follows:

1. X > 0 ∧ ret == Y + 1

2. ¬(X > 0) ∧ ret == Y - 1

where ret indicates the return value of the method. 
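To make the construction of this summary concrete, the following Java sketch enumerates the paths of a tiny program tree and accumulates a path condition along each one. It is a toy model, not a symbolic execution engine: conditions and return expressions are plain strings, no satisfiability check is performed, and the class names are hypothetical. The operators ∧ and ¬ are written as && and ! in the generated strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of path-condition collection for testX.
final class PathConditionSketch {
    static abstract class Node { }

    // A conditional with a symbolic condition and two successors.
    static final class Branch extends Node {
        final String cond;
        final Node thenNode;
        final Node elseNode;

        Branch(String cond, Node thenNode, Node elseNode) {
            this.cond = cond;
            this.thenNode = thenNode;
            this.elseNode = elseNode;
        }
    }

    // A return of a symbolic expression.
    static final class Return extends Node {
        final String expr;

        Return(String expr) {
            this.expr = expr;
        }
    }

    // Collect one summary entry per path through the tree.
    static List<String> summarize(Node root) {
        List<String> out = new ArrayList<>();
        explore(root, "", out);
        return out;
    }

    private static void explore(Node node, String pc, List<String> out) {
        if (node instanceof Return) {
            out.add(pc + "ret == " + ((Return) node).expr);
            return;
        }
        Branch b = (Branch) node;
        explore(b.thenNode, pc + b.cond + " && ", out);        // condition holds
        explore(b.elseNode, pc + "!(" + b.cond + ") && ", out); // condition fails
    }

    // The testX method with symbolic inputs X and Y.
    static Node testX() {
        return new Branch("X > 0", new Return("Y + 1"), new Return("Y - 1"));
    }

    public static void main(String[] args) {
        for (String entry : summarize(testX())) {
            System.out.println(entry);
        }
    }
}
```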

During symbolic execution, the current path condi- 
tion is checked for satisfiability. A decision procedure 
is used to check if there exists an assignment of values
to the program variables that will make the constraints
in the path condition satisfiable (true). When
the constraints on the path condition are not satisfi- 
able, the execution path is marked as infeasible. The 
execution stops along an infeasible path and the search 
backtracks. In programs with loops and recursion, in- 
finitely long execution paths may be generated. In or- 
der to guarantee termination of the execution in such 
cases, a user-specified depth bound is provided as input 
to symbolic execution. Whenever the size of the current 
execution path reaches this user-specified depth bound, 
the search backtracks. 
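The effect of the depth bound can be sketched for a loop of the form while (x > 0) x = x - 1; with a symbolic input X. In the hypothetical Java sketch below, depth is counted as the number of loop iterations taken so far, which is one possible reading of "size of the current execution path"; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a depth bound on a symbolic loop.
final class DepthBoundSketch {
    // Path conditions of all paths that leave the loop within the bound.
    static List<String> explore(int bound) {
        List<String> completed = new ArrayList<>();
        step(0, bound, "", completed);
        return completed;
    }

    private static void step(int iterations, int bound, String pc, List<String> completed) {
        if (iterations == bound) {
            return; // depth bound reached: backtrack without completing this path
        }
        // Loop guard after 'iterations' decrements of the symbolic input X.
        String guard = "(X - " + iterations + " > 0)";
        completed.add(pc + "!" + guard);                             // loop exits here
        step(iterations + 1, bound, pc + guard + " && ", completed); // loop continues
    }

    public static void main(String[] args) {
        for (String pc : explore(3)) {
            System.out.println(pc);
        }
    }
}
```

Without the bound the recursion would never terminate, because every additional iteration yields a new, longer path condition.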

At the end of symbolic execution, all of the path 
conditions generated are collected into a symbolic sum- 
mary. Each path condition in the symbolic summary 
represents a set of (feasible) concrete execution paths. 
The path conditions in a symbolic summary can be used 
as input to other program analysis techniques, such as 
regression testing. For example, the values of a solved 
path condition form the set of concrete input values 
that will cause the program to execute that path in the 
program, and as such can be used to generate or select 
regression tests. 
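As a minimal illustration of turning a path condition into a test input, the Java sketch below searches a small integer range for a value that satisfies a path condition over a single input. The brute-force search is only a stand-in for the decision procedure used by a symbolic execution engine, and the class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

// Hypothetical stand-in for solving a path condition over one integer input.
final class WitnessSketch {
    // Return the smallest value in [lo, hi] that satisfies the path condition, if any.
    static OptionalInt solve(IntPredicate pathCondition, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (pathCondition.test(x)) {
                return OptionalInt.of(x);
            }
        }
        return OptionalInt.empty(); // no witness found in the searched range
    }

    public static void main(String[] args) {
        // The two path conditions of testX, expressed over a concrete candidate x.
        System.out.println(solve(x -> x > 0, -10, 10));
        System.out.println(solve(x -> !(x > 0), -10, 10));
        // A contradictory condition has no witness, mirroring an infeasible path.
        System.out.println(solve(x -> x > 0 && !(x > 0), -10, 10).isPresent());
    }
}
```

Each value found this way is a concrete input that drives execution down the corresponding path and can therefore seed a regression test for that path.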

4.2.2 Computing Impacted Path Conditions

The second phase of DiSE performs a form of incremen- 
tal symbolic execution on the modified version of the 
program. DiSE directs symbolic execution to explore 
only the parts of the program that are impacted by the 
changes to the code. DiSE leverages the set of impacted 
source lines computed in the previous phase, and the 
reachability information encoded in the CFG as input 
in order to explore a subset of the feasible execution 
paths. When no impacted statements are reachable on 
the current path, symbolic execution backtracks, avoid- 
ing the cost of unnecessarily exploring and characteriz- 
ing execution paths in the modified version of the pro- 
gram that are not impacted by the change(s) to the 
program. 
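The backtracking decision relies on a reachability query over the CFG. The Java sketch below shows one way such a query could look: a depth-first search from the current node that stops as soon as an impacted node is found. The CFG in exampleCfg is invented for illustration (it is not the graph of Fig. 1), and the names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical reachability check used to decide whether to keep exploring.
final class ReachabilitySketch {
    // True if 'from' is impacted or some impacted node is reachable from it.
    static boolean canReachImpacted(Map<Integer, List<Integer>> cfg, int from,
                                    Set<Integer> impacted) {
        Deque<Integer> work = new ArrayDeque<>();
        Set<Integer> seen = new HashSet<>();
        work.push(from);
        while (!work.isEmpty()) {
            int node = work.pop();
            if (!seen.add(node)) {
                continue; // already visited
            }
            if (impacted.contains(node)) {
                return true;
            }
            for (int succ : cfg.getOrDefault(node, Collections.<Integer>emptyList())) {
                work.push(succ);
            }
        }
        return false; // nothing impacted ahead: the search can backtrack
    }

    // An invented CFG: node 0 branches to 1 and 2, which rejoin at node 5.
    static Map<Integer, List<Integer>> exampleCfg() {
        Map<Integer, List<Integer>> cfg = new HashMap<>();
        cfg.put(0, Arrays.asList(1, 2));
        cfg.put(1, Arrays.asList(3));
        cfg.put(2, Arrays.asList(4));
        cfg.put(3, Arrays.asList(5));
        cfg.put(4, Arrays.asList(5));
        return cfg;
    }

    public static void main(String[] args) {
        Set<Integer> impacted = Collections.singleton(3);
        System.out.println(canReachImpacted(exampleCfg(), 1, impacted)); // true
        System.out.println(canReachImpacted(exampleCfg(), 2, impacted)); // false
    }
}
```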

There is another important aspect of pruning within 
DiSE that sets it apart from other change-impact anal- 
ysis techniques. DiSE prunes certain symbolic execu- 
tion paths by exploring only a subset of the possible 
choices. DiSE may prune choices at conditional branch 
statements, e.g., when these statements are not marked 
as impacted in phase I, even if other impacted source 
lines are reachable from the block. We demonstrate the 
intuition for this pruning through an example: 

int a, b;

void pruneTest(int x, int y){
1:  if (x > 0)
2:    a = x + 1;
3:  else
4:    a = x - 1;
5:  if (y > 0)      *
6:    b = y + 2;    *
7:  else            *
8:    b = y - 2;    *
9: }

Suppose the source lines of code identified with an
* are marked as impacted in phase I. There are two
conditional statement blocks: one block at lines 1-4
is controlled by the value of variable x, while the other
block at lines 5-8 is controlled by the value of variable
y. There are four possible symbolic paths in this
program:

1. (X > 0) ∧ (Y > 0)

2. (X > 0) ∧ ¬(Y > 0)

3. ¬(X > 0) ∧ (Y > 0)

4. ¬(X > 0) ∧ ¬(Y > 0)

Since the first conditional block is not impacted by the 
change, DiSE explores only one choice for the value of 
x, i.e., explores the same path through the unimpacted
code (X > 0 or ¬(X > 0)) for all program executions
through the unimpacted code. As a result, DiSE prunes 
two of the paths shown above, e.g., it prunes 1 and 2, or 
3 and 4. Which paths are pruned is determined by the 
search strategy implemented by the symbolic execution 
engine, e.g., random, greedy, default. 
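The pruning in this example can be mimicked with a few lines of Java. The sketch below enumerates the paths of pruneTest twice: once exploring both outcomes of every branch, and once exploring both outcomes only at branches marked as impacted. It is a simplification of DiSE under explicit assumptions: each branch point is reduced to a condition string plus an impacted flag, and the "search strategy" is fixed to always take the true outcome at an unimpacted branch. The names are hypothetical, and ∧ and ¬ are written as && and !.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of pruning choices at unimpacted branches.
final class PruneSketch {
    static final class BranchPoint {
        final String cond;
        final boolean impacted;

        BranchPoint(String cond, boolean impacted) {
            this.cond = cond;
            this.impacted = impacted;
        }
    }

    // Enumerate path conditions; when 'directed' is set, only the first
    // outcome is explored at branches that are not marked as impacted.
    static List<String> paths(List<BranchPoint> branches, boolean directed) {
        List<String> out = new ArrayList<>();
        explore(branches, 0, "", directed, out);
        return out;
    }

    private static void explore(List<BranchPoint> branches, int index, String pc,
                                boolean directed, List<String> out) {
        if (index == branches.size()) {
            out.add(pc);
            return;
        }
        BranchPoint b = branches.get(index);
        String sep = pc.isEmpty() ? "" : " && ";
        explore(branches, index + 1, pc + sep + "(" + b.cond + ")", directed, out);
        if (!directed || b.impacted) {
            explore(branches, index + 1, pc + sep + "!(" + b.cond + ")", directed, out);
        }
    }

    // pruneTest: the branch on x is unimpacted, the branch on y is impacted.
    static List<BranchPoint> pruneTest() {
        return Arrays.asList(new BranchPoint("X > 0", false),
                             new BranchPoint("Y > 0", true));
    }

    public static void main(String[] args) {
        System.out.println(paths(pruneTest(), false).size()); // 4
        System.out.println(paths(pruneTest(), true));         // paths 1 and 2 only
    }
}
```

With this fixed strategy the sketch keeps paths 1 and 2 and prunes paths 3 and 4; a different strategy could equally keep 3 and 4.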

The resulting set of path conditions computed by 
DiSE then characterizes the set of program execution 
behaviors in the modified version of the procedure that 
are impacted by the change(s). These path conditions 
serve as the input to the change-impact analysis pre- 
sented in Section 5. 

4.2.3 Scalability and Limitations of DiSE

Scalability The use of symbolic execution to compute 
impacted program behaviors is the primary factor af- 
fecting the scalability of the DiSE algorithm. Recent 
advances in reduction and abstraction techniques, con- 
straint solving, raw computing power, and in the devel- 
opment of novel reuse techniques such as [43,47], have 
helped to improve the scalability of symbolic execu- 
tion. These improvements to symbolic execution can be 
leveraged to help improve the scalability of DiSE. The 
smaller symbolic summaries computed by DiSE bene- 
fit the program analysis techniques which use the DiSE 
results by reducing the scope of the analysis to the pro- 
gram behaviors impacted by the differences. 

Limitations The DiSE algorithm was originally implemented
as an intraprocedural analysis. In [34] we present iDiSE,
an interprocedural version of our algorithm. The current
versions of the DiSE and iDiSE algorithms do not compute
the impact of changes to dynamically allocated data or
changes to global data, e.g., fields in Java classes;
however, we are working on a version of the iDiSE algorithm
to compute the impact of these types of changes.

Other limitations of our change impact analysis are
inherent in the use of symbolic execution. In Section 4.2.1
we explain how a user-specified depth bound may be necessary
to avoid infinitely long execution paths when the loop
bounds are unknown a priori. Symbolic execution is also
limited by the theories supported by the decision procedures
that the symbolic execution engine uses, for example, when
reasoning about non-linear arithmetic, operations on complex
data structures, and library operations on those structures.
It is interesting to note that these limitations are
actually part of the motivation for DiSE: our goal was to
avoid the program structures that contribute to these
limitations whenever possible by exploring only the parts
of the symbolic execution space that are impacted by the
changes to the code.


5 Application 

In previous work [28,34], we discuss how the results
of DiSE can be used to support regression testing and
delta debugging techniques. The path conditions generated
by DiSE along impacted program statements are solved to
facilitate regression testing tasks. We present an
evaluation in [28,34] that demonstrates how the solutions
to the impacted path conditions support better test case
selection and augmentation of the existing test suite than
full symbolic execution alone. We also show how the output
of DiSE can be configured to generate test inputs that
satisfy different coverage criteria, e.g., impacted branch
coverage and impacted statement coverage. The information
about which constraints are generated at impacted program
locations can be used to improve the efficiency of delta
debugging, as shown in [34]. We have analyzed synchronous
reactive components from the automotive and avionics
domains. For example, we have previously analyzed versions
of the Altitude Switch (ASW) application, which turns power
on to a device of interest when the aircraft descends below
a threshold altitude above ground level. We have also
analyzed NASA’s On-board Abort Executive (OAE), which models
the Crew Exploration Vehicle’s prototype ascent abort
handling software.


In this section, we discuss a new application of DiSE, 
demonstrating its utility in maintaining a software health 
management framework that uses a Bayesian network. 
We demonstrate the value of the change-impact infor- 
mation computed by DiSE in facilitating the process 
of managing the health of the SWHM monitor as the 
monitored software is changed. We first present back- 
ground information on Bayesian networks and discuss 
the advantages of using Bayesian networks for software 
health management. We then present an example soft- 
ware health management system modeled as a Bayesian 
network, and describe how the change-impact infor- 
mation about the monitored software can be used to 
help update and test the software health manager rep- 
resented as a Bayesian network. 


5.1 Bayesian Networks 

Bayesian networks are used to reason about data in 
the presence of uncertainty [11,25]. A Bayesian net- 
work is a directed acyclic graph where the nodes in the 
graph represent statistical variables in the system, and 
the edges between the nodes represent dependencies between
the different variables in the system. Recent work has
explored using Bayesian networks to model software health
management systems [37-39]. The software health manager
monitors various software and hardware systems. Data from
the hardware and software
sensors is presented as evidence to the nodes in the 
Bayesian network. Based on the data, the Bayesian net- 
work reasons about failures and root causes for the fail- 
ures in the system (hardware or software) being moni- 
tored. 

As used in this approach for SWHM, Bayesian networks
contain several types of nodes, each with a specific role
in the system. Command nodes receive signals that are
interpreted as commands. Sensor nodes receive signals that
provide data about the variables in the monitored hardware
or software. Health nodes indicate the health status of a
sensor. Status nodes encode the unobservable status of a
particular sub-system. Finally, behavior nodes connect
various nodes in the network in order to recognize
behavioral patterns.

Bayesian networks have several advantages for mod- 
eling software health management systems. For exam- 
ple, they have full forward and backward reasoning ca- 
pabilities. In forward reasoning, the network calculates 
the probabilities on the status of the health nodes based 
on the values of the sensor nodes; this provides a
diagnosis and the ‘most likely’ root cause explaining the
current data. In backward reasoning, when the network
observes a certain diagnosis, it can reason about which
sensors are most likely broken.
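
As a worked illustration of the forward step, consider the smallest
possible network: one health node with states "healthy" and "bad" and one
sensor node that reads either nominal or abnormal. The sketch below
applies Bayes' rule to update the health node after an abnormal reading.
The class name, method name, and all probabilities are assumptions chosen
for illustration; they are not taken from the network discussed in this
paper.

```java
/** Two-node illustration: one health node, one sensor node. */
final class HealthNodeSketch {

    /**
     * Posterior P(health = bad | sensor reading is abnormal), by Bayes' rule.
     * priorBad is P(bad); P(healthy) is taken as 1 - priorBad.
     */
    static double posteriorBad(double priorBad,
                               double pAbnormalGivenBad,
                               double pAbnormalGivenHealthy) {
        double priorHealthy = 1.0 - priorBad;
        // Total probability of observing an abnormal reading.
        double evidence = priorBad * pAbnormalGivenBad
                + priorHealthy * pAbnormalGivenHealthy;
        return priorBad * pAbnormalGivenBad / evidence;
    }

    public static void main(String[] args) {
        // Assumed numbers: P(bad) = 0.05, P(abnormal | bad) = 0.9,
        // P(abnormal | healthy) = 0.02.
        double pBad = posteriorBad(0.05, 0.9, 0.02);
        System.out.println("P(bad | abnormal)     = " + pBad);
        System.out.println("P(healthy | abnormal) = " + (1.0 - pBad));
    }
}
```

Under these assumed numbers a single abnormal reading raises the
probability of the "bad" state from 0.05 to roughly 0.70, and the two
posterior probabilities still sum to one.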

5.2 Example Program 

We use the example source code shown in Fig. 2 to
demonstrate the challenges of maintaining the health
of a software health manager in the context of evolving
systems (in this case, software changes). The code
fragment is a simplified version of the Wheel Brake
System (WBS), a synchronous reactive component derived
from the WBS case example found in ARP 4761 [18,35]. The
Java code is based on a Simulink model that was translated
to C using tools developed at Rockwell Collins and then
manually translated to Java. The goal of this code is to
determine how much braking pressure to apply based on the
environment. The full Java program consists of one class
and a total of 231 source lines of code.

Two versions of the method update(int PedalPos,
int BSwitch, int PedalCmd) are shown in Fig. 2. In
both versions of the program, the update method sets 
the value of two global variables, AltPress and Meter, 
based on the input values of its arguments. The ver- 
sion on the left, Fig. 2(a), is the original version, and 
the version on the right, Fig. 2(b), is the modified ver- 
sion. The change to the code is on line 9 of the update 
method in Fig. 2(b), where an additional else clause 
is added to the update method. This code creates an 
additional case for checking the value of PedalPos and 
setting the value of PedalCmd. 

An example Bayesian network for the code example
in Fig. 2 is shown in Fig. 3. The sensor nodes for the
input variables PedalPos, BSwitch, and PedalCmd are
labeled accordingly in Fig. 3. The nodes labeled with the
prefix H_ are health nodes. For example, H_PedalPos is a
health node that monitors the health of the sensor node
PedalPos. A health node’s probabilities indicate the health
of the component. The node usually has two states, “healthy”
and “bad”, whose probabilities sum to one:
p(healthy) + p(bad) = 1. The nodes labeled with the prefix
U_ are command nodes that represent update variables in the
system. Variables in the system are identified by their
node label. The final updates to the global variables flow
to the nodes labeled AltPress and Meter in Fig. 3. The
edges between the nodes represent dependencies. For example,
the updates to the PedalCmd variable in Fig. 2(a) are
only possible for certain values of PedalPos: the assignment
is contingent on the conditional statement at line 5 or
line 7 in Fig. 2(a) evaluating to true.



Fig. 3 The Bayesian network for the example in Fig. 2. 


For the purposes of maintaining the health of the 
SWHM monitor, it is useful to identify the areas in the 
Bayesian network that are not impacted by the changes 
to the monitored code; these parts of the network do 
not need to be re-analyzed or re-tested, potentially lead- 
ing to a considerable savings in the maintenance costs. 
Even just the basic ability to mark the impacted nodes 
in the Bayesian network is useful because it facilitates 
manual inspection of the network. In large Bayesian 
networks, the impact analysis can be especially useful 
to detect the parts of the network that are impacted by 
the changes made to the monitored software. 

By visual inspection of the graph in Fig. 3, we can 
see that there are two disjoint graphs. The change at 
line 9 in Fig. 2(b) corresponds to the shaded nodes 
in Fig. 3. Without any additional information, we can 
infer that all of the nodes in the top graph are impacted 
by the changes, whereas the nodes in the bottom graph 
are not impacted by the changes. In the next section, 
we illustrate how the change-impact results computed 
by DiSE can be used to mark the subset of the nodes 
in the top graph of the Bayesian network in Fig. 3 as 
impacted. 
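
The coarse, component-level marking described above (every node in the
same connected part of the network as a changed node is treated as
impacted) can be sketched as a simple graph traversal. Since Fig. 3 is
not reproduced here, the edge list in hypotheticalNetwork is an assumption
made for illustration only, as are the class and method names; only the
node labels come from the text.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ComponentMarkingSketch {

    /** Adds an undirected edge; direction is ignored for this coarse marking. */
    static void edge(Map<String, List<String>> g, String a, String b) {
        g.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        g.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** Returns every node connected to at least one changed node. */
    static Set<String> connectedTo(Map<String, List<String>> g,
                                   Collection<String> changed) {
        Set<String> marked = new TreeSet<>(changed);
        Deque<String> work = new ArrayDeque<>(changed);
        while (!work.isEmpty()) {
            String node = work.pop();
            for (String next : g.getOrDefault(node, Collections.<String>emptyList())) {
                if (marked.add(next)) {
                    work.push(next);
                }
            }
        }
        return marked;
    }

    /** HYPOTHETICAL edges: the real structure of Fig. 3 may differ. */
    static Map<String, List<String>> hypotheticalNetwork() {
        Map<String, List<String>> g = new HashMap<>();
        edge(g, "H_PedalPos", "PedalPos");
        edge(g, "PedalPos", "U_PedalCmd");
        edge(g, "U_PedalCmd", "PedalCmd");
        edge(g, "PedalCmd", "AltPress");
        edge(g, "H_BSwitch", "BSwitch");
        edge(g, "BSwitch", "Meter");
        return g;
    }

    public static void main(String[] args) {
        Set<String> marked = connectedTo(hypotheticalNetwork(),
                Arrays.asList("PedalCmd", "U_PedalCmd"));
        System.out.println(marked);
    }
}
```

In this assumed network the traversal marks the whole PedalPos/PedalCmd
component and leaves the BSwitch/Meter component untouched, which is the
over-approximation that a finer change-impact analysis aims to shrink.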

5.3 Running DiSE 

The input to DiSE for the example shown in Fig. 2
is the modified program in Fig. 2(b) and the set of
modified source lines of code, which for this example is a
singleton set: {9: else PedalCmd = PedalPos * 1}. This
additional statement is added to the code in order to
cover all of the possible cases within the first conditional
block. The set of modified source lines can be efficiently
computed using any source-level diff tool.

1:  /* Global State Variables */
2:  int AltPress := 0
3:  int Meter := 2
4:
    int update(int PedalPos, int BSwitch, int PedalCmd)
5:    if PedalPos == 0 then
6:      PedalCmd = PedalPos + 1
7:    else if PedalPos >= 1 then
8:      PedalCmd = PedalPos + 2
9:
10:   if PedalCmd == 1 then
11:     AltPress = 0
12:   else if PedalCmd == 2 then
13:     AltPress = 1/4
14:
15:   if BSwitch == 0 then
16:     Meter = 1
17:   else if BSwitch == 1 then
18:     Meter = 2

(a)

1:  /* Global State Variables */
2:  int AltPress := 0
3:  int Meter := 2
4:
    int update(int PedalPos, int BSwitch, int PedalCmd)
5:    if PedalPos == 0 then
6:      PedalCmd = PedalPos + 1
7:    else if PedalPos >= 1 then
8:      PedalCmd = PedalPos + 2
9:    else PedalCmd = PedalPos * 1
10:
11:   if PedalCmd == 1 then
12:     AltPress = 0
13:   else if PedalCmd == 2 then
14:     AltPress = 1/4
15:
16:   if BSwitch == 0 then
17:     Meter = 1
18:   else if BSwitch == 1 then
19:     Meter = 2

(b)

Fig. 2 Code fragments from a simplified WBS example:
(a) original version and (b) modified version.

The static analysis algorithm in DiSE is applied to
the modified version of the source code. The change to
the assignment of PedalCmd does not have any impact
on the block of statements at lines 16-19 in Fig. 2(b)
because the value of PedalCmd is not used (read) at
those lines; however, the block of statements at lines
11-14 may potentially be impacted by the assignment
to PedalCmd at line 9. The value of PedalCmd at lines
11 and 13 is used to determine which code is to be
executed, i.e., line 12 or line 14. As a result, at the end
of phase I of DiSE, the set of impacted statements will
include the following statements:

{
  9 : else PedalCmd = PedalPos * 1,
  11 : if PedalCmd == 1 then,
  12 : AltPress = 0,
  13 : else if PedalCmd == 2 then,
  14 : AltPress = 1/4
}
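
One way to picture how such a set arises is as a forward closure over
dependence edges starting from the changed line. The sketch below is a
simplification under stated assumptions, not the phase I algorithm of
DiSE: the dependence edges for Fig. 2(b) are written out by hand rather
than computed from a control-flow graph, and the class and method names
are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class ImpactSetSketch {

    /** Worklist closure: a line is impacted if it depends on an impacted line. */
    static SortedSet<Integer> impacted(Map<Integer, int[]> dependents, int... changed) {
        SortedSet<Integer> result = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        for (int c : changed) {
            if (result.add(c)) {
                work.push(c);
            }
        }
        while (!work.isEmpty()) {
            int line = work.pop();
            for (int d : dependents.getOrDefault(line, new int[0])) {
                if (result.add(d)) {
                    work.push(d);
                }
            }
        }
        return result;
    }

    /** Hand-written dependence edges for the line numbers of Fig. 2(b). */
    static Map<Integer, int[]> fig2bDependents() {
        Map<Integer, int[]> deps = new HashMap<>();
        deps.put(9, new int[] {11, 13});   // data: PedalCmd set at 9, read at 11 and 13
        deps.put(11, new int[] {12, 13});  // control: 11 decides whether 12 or 13 runs
        deps.put(13, new int[] {14});      // control: 13 decides whether 14 runs
        deps.put(16, new int[] {17, 18});  // control: the BSwitch block, never reached from 9
        deps.put(18, new int[] {19});
        return deps;
    }

    public static void main(String[] args) {
        System.out.println(impacted(fig2bDependents(), 9));
    }
}
```

Starting from line 9, the closure reaches lines 11 to 14 and never touches
lines 16 to 19, matching the set listed above.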

This set of impacted statements will then be used to di- 
rect symbolic execution during the next phase of DiSE. 
Recall that during symbolic execution of the modified 
version of the code, checks are made to determine if any 
impacted program statements are reachable from the 
current program location. This ensures that only the 
impacted execution behaviors are explored and charac- 
terized. Let us consider the part of a path condition gen- 
erated during symbolic execution along the impacted 
set of program locations: 

PedalPos ≠ 0 ∧ PedalPos < 1 ∧ (PedalPos * 1) == 1

The variable PedalCmd is replaced with the value assigned
to it at line 9, PedalPos * 1. The constraint shown above
is, however, not satisfiable, i.e., no assignment to
PedalPos satisfies it. The first two conjuncts specify that
PedalPos is a negative number, which contradicts the final
conjunct. Similarly, another partial path condition
generated along the impacted set of program locations is:

PedalPos ≠ 0 ∧ PedalPos < 1 ∧ (PedalPos * 1) == 2

This path condition is also not satisfiable. Based on the
results of symbolic execution we can then state
conclusively that the change made to the assignment of
PedalCmd does not impact the assignment to the global
variable AltPress. Because both path conditions generated
along the impacted program locations in Fig. 2(b) are
unsatisfiable, the change made to the assignment of
PedalCmd does not impact any other part of the program.
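
The infeasibility of the two path conditions can be sanity-checked by
brute force over a bounded range of integers. This is only an
illustrative check with hypothetical names; a symbolic execution engine
would hand the constraints to a decision procedure instead of enumerating
values, and a bounded search is not a proof.

```java
/** Bounded brute-force check of the two partial path conditions. */
final class PathConditionCheck {

    /**
     * Returns true if some PedalPos in [lo, hi] satisfies
     * PedalPos != 0 && PedalPos < 1 && (PedalPos * 1) == target.
     * Keep hi below Integer.MAX_VALUE so the loop counter cannot overflow.
     */
    static boolean satisfiableInRange(int lo, int hi, int target) {
        for (int pedalPos = lo; pedalPos <= hi; pedalPos++) {
            if (pedalPos != 0 && pedalPos < 1 && pedalPos * 1 == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Targets 1 and 2 correspond to the branches at lines 11 and 13.
        System.out.println(satisfiableInRange(-1000, 1000, 1));
        System.out.println(satisfiableInRange(-1000, 1000, 2));
        // A negative target shows that line 9 itself is reachable.
        System.out.println(satisfiableInRange(-1000, 1000, -1));
    }
}
```

The first two calls find no witness, in line with the argument that
PedalPos must be negative on this path, while the third call finds
PedalPos = -1.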

5.4 Impact of changes 

The results of DiSE can now be used to color the
impacted nodes in the Bayesian network. Nodes PedalCmd
and U_PedalCmd are initially marked as impacted by
the change in Fig. 3 based on the syntactic diff. The
results of DiSE indicate that no additional nodes are 
impacted by the change made to the monitored pro- 
gram. This is a safe estimation of the impact of the 
change, in the sense that the analysis does not miss 
marking any impacted nodes. In other words, DiSE is 
a precise and conservative technique for generating the 
set of impacted program execution behaviors. 

The application of DiSE results to the coloring of 
the nodes in the Bayesian network is quite useful for 
manually inspecting the effects of a change to the
monitored system. The size of the Bayesian network can
be very large when the monitored system is composed 
of large numbers of variables. In such cases, the color- 
ing of nodes is a helpful tool for visualizing the impact 
of the changes. When only a small number of nodes is 
impacted, it is easy to identify the parts of the net- 
work that do not need to be re-analyzed and tested in 
order to check their correctness. This can result in a 
significant savings while maintaining the health of the 
monitor. 

5.5 WBS Results 

In this section, we present a subset of the results of an 
evaluation of DiSE performed in [28]. Here we present
the results for the WBS example to illustrate how changes 
to the code may impact the program execution behav- 
iors. Note that the entire WBS program consists of one 
method that is invoked from a main method. We refer 
to the WBS method and WBS program interchangeably
in this section. DiSE is implemented as an extension
of the Java PathFinder toolkit [44]. The details of
the implementation are described in Section 6. The goal 
of the evaluation was to answer two research questions: 
(RQ1) How does the cost of applying DiSE compare to 
full symbolic execution on the changed WBS program? 
(RQ2) How does the number of impacted path condi- 
tions generated by DiSE compare with the number of 
path conditions generated by full symbolic execution? 

In Table 1, we list the results of running DiSE and
full symbolic execution on each version of the WBS example.
For each mutant (changed) version of WBS, we list the
number of CFG nodes changed (Changed) and the number of CFG
nodes impacted by the changes (Impacted). We also present
the following metrics: the time to perform DiSE and the time
to perform traditional symbolic execution of the mutant
version as reported by SPF, the number of states explored
during execution of each technique, and the number of path
conditions generated by each technique in the resulting
method summary. The results for DiSE are listed under the
subheading DiSE and the results for traditional symbolic
execution are listed under the subheading Full Symbc.

To evaluate DiSE on the WBS program we needed
multiple versions of the program. Because multiple versions
of the WBS program are not available, we generated them by
manually creating mutants of the base version (v0). When
creating mutants, we considered a broad range of changes
that can be applied to the code: change location, change
type, and number of changes. We introduced changes at the
beginning, middle, and end of the WBS method. We also
considered the control structures in the code, and made
changes at various depths in nested control structures.
Each mutant has one, two, or three changed Java statements,
resulting in up to nine changed nodes in the CFG for the
changed version of the WBS program, as shown in Table 1.
Versions 1-6 contain a single changed Java source
statement, versions 7-11 contain two changed statements,
and versions 12-16 contain three changed statements.

5.5.1 Results and Analysis 

RQ1 (Cost). In Table 1, we can see that for the majority
of versions of the WBS program, DiSE takes considerably
less time than full symbolic execution. In many cases, the
difference in time is several orders of magnitude. In the
versions where the changes to the program do not impact
all path conditions (program paths), DiSE takes at most
20% of the time taken by full symbolic execution. In
versions v1, v7, v10, v14, and v15, where DiSE explores the
same number of states as full symbolic execution, the time
taken by DiSE is 9%-30% longer than full symbolic
execution. This extra execution time accounts for the
overhead of computing the impacted locations and supporting
data structures.
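
The cost comparison can be reproduced directly from the mm:ss values in
Table 1. The following sketch, with hypothetical class and method names,
converts two rows to seconds and computes the ratio of DiSE time to full
symbolic execution time; v2 and v7 are used as one example of each case
discussed above.

```java
/** Recomputes DiSE-to-full-symbolic-execution time ratios from Table 1. */
final class CostRatioSketch {

    /** Converts a "mm:ss" string, as printed in Table 1, to seconds. */
    static int seconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** Ratio of DiSE time to full symbolic execution time. */
    static double ratio(String diseTime, String fullTime) {
        return (double) seconds(diseTime) / seconds(fullTime);
    }

    public static void main(String[] args) {
        // v2: only some path conditions are impacted (00:08 vs 02:22).
        System.out.printf("v2: DiSE takes %.1f%% of the full time%n",
                100.0 * ratio("00:08", "02:22"));
        // v7: all path conditions are impacted (03:07 vs 02:51).
        System.out.printf("v7: DiSE overhead is %.1f%%%n",
                100.0 * (ratio("03:07", "02:51") - 1.0));
    }
}
```

For v2 the ratio is 8 s / 142 s, i.e., under 6% of the full symbolic
execution time; for v7 it is 187 s / 171 s, an overhead of a little over
9%.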

RQ2 (Effectiveness). The number of path conditions
computed by DiSE varies greatly between the different
versions of WBS. In the versions where DiSE explores the
same number of states as full symbolic execution, the
number of impacted path conditions is the same as the
number generated by full symbolic execution. For most of
the WBS versions, DiSE generates fewer path conditions than
full symbolic execution; e.g., for v3 and v9 DiSE generates
half as many (12 versus 24). There is also a reduction in
the number of path conditions generated for other versions:
v2, v4, v5, v6, etc.

Overall, the comparison demonstrates that DiSE
has potential application for detecting and characterizing
impacted program behaviors in evolving software. In the
WBS program, DiSE correctly identifies and characterizes
the subset of path conditions computed by full symbolic
execution as impacted. In some instances, the change
impacted only a small percentage of path conditions, and in
others, the change(s) had a much greater impact. When only
a subset of the path conditions is impacted by the changes,
DiSE consistently computes the impacted path conditions in
less time than full symbolic execution, often by several
orders of magnitude; when all of the path conditions are
impacted by the changes, the overhead incurred by DiSE is
between 9% and 30% for the WBS mutants.
          CFG Nodes            Time (mm:ss)          States Explored          Path Conditions
Version   Changed  Impacted    DiSE    Full Symbc    DiSE      Full Symbc     DiSE   Full Symbc
v1        1        39          03:19   02:30         677,976   677,976        24     24
v2        1        7           00:08   02:22         93        677,976        17     24
v3        1        3           00:27   02:41         65,976    677,976        12     24
v4        1        0           00:08   02:44         17        677,976        1      24
v5        7        56          00:23   03:44         59,610    1,317,048      14     24
v6        1        1           00:08   02:44         17        677,976        1      24
v7        1        39          03:07   02:51         677,976   677,976        24     24
v8        8        57          00:29   03:45         59,610    1,317,048      14     24
v9        2        4           00:33   02:41         65,976    677,976        12     24
v10       2        39          03:40   02:51         677,976   677,976        24     24
v11       7        56          00:28   03:43         59,610    1,317,048      14     24
v12       8        65          00:31   03:54         70,129    1,317,048      6      24
v13       9        57          00:29   03:44         59,610    1,317,048      14     24
v14       3        39          03:39   02:51         677,976   677,976        24     24
v15       3        42          03:37   02:51         677,976   677,976        24     24
v16       8        56          00:28   03:43         59,610    1,317,048      14     24

Table 1 DiSE results for WBS


The entire WBS program is 231 lines of Java source 
code. Since the WBS example is a single method, some 
changes could impact the entire method. In larger ex- 
amples, we expect that changes are more likely to be 
localized to certain methods or components. 

The results in this section illustrate the effectiveness
of DiSE at characterizing the impact of changes on the
execution behaviors of the modified code. In the context
of software health management, these results illustrate
the potential of DiSE to reduce the cost and effort of
maintaining the correctness of the SWHM monitor when the
monitored code is changed.

6 Tool Support 

DiSE is implemented within the Java PathFinder [44]
toolkit. It is an extension of the symbolic execution en- 
gine, Symbolic PathFinder [24,30]. 

6.1 Java PathFinder 

The Java PathFinder (JPF) model checker is an open- 
source Java bytecode analysis framework. The core of 
JPF is an explicit state model checker for Java byte- 
code. JPF is a customized Virtual Machine that sup- 
ports state storage, state matching, and configurable 
execution semantics of bytecode instructions. It sup- 
ports controlled scheduling choices in concurrent pro- 
grams, and monitoring of program executions with Ob- 
server design patterns. It checks for properties such as 
deadlock, race conditions, and the absence of unhandled 
exceptions. One of the defining qualities of JPF is its
extensibility. JPF has been extended to support symbolic
execution, directed automated random testing,
configurable state abstractions, various heuristics for enabling
bug detection, configurable search strategies, checking 
of temporal properties and much more. JPF supports 
these extensions at the design level through a set of 
stable, well-defined interfaces. 


6.2 Symbolic PathFinder 

Symbolic PathFinder (SPF) is the symbolic execution
engine for JPF. SPF is an open-source execution en- 
gine that symbolically executes Java bytecode. SPF 
supports a variety of constraint solvers/decision pro- 
cedures for solving path conditions such as Choco [8], 
IASolver [16], and CVC3 [10]. In general, state match- 
ing is undecidable when states represent path condi- 
tions on unbounded input data. Hence, SPF does not 
perform any state matching and explores the symbolic 
execution tree using a stateless search. Furthermore, if 
the solver is unable to determine the satisfiability of 
the path condition within a certain time bound, SPF 
treats the path condition as unsatisfiable. Because of
this limitation of the constraint solvers, symbolic
execution may fail to generate path conditions for some
feasible execution paths. Loops and recursion can be
bounded by placing a limit on the search depth in SPF or
by limiting the number of constraints encoded for any
given path; SPF indicates when one of these bounds
has been reached during symbolic execution.
6.3 Directed Incremental Symbolic Execution

DiSE extends SPF by implementing the custom data- 
flow and control-flow analyses used to compute the set 
of impacted program statements. The control- and data- 
flow analyses compute a conservative approximation of 
the impacted Java bytecode instructions in changed 
methods in the modified program. The implementation 
supports both intra-procedural analysis (data and con- 
trol flow within a method) and inter-procedural analy- 
sis (data and control flow across different method calls). 
The impacted Java bytecode instructions are used to di- 
rect symbolic execution along execution paths leading 
to impacted Java bytecode instructions, while the other 
paths are pruned. DiSE is implemented in an extension 
called jpf-regression. The output of DiSE is a set of
path conditions that describe the constraints over the 
input and global variables. These constraints represent 
the impacted program behaviors of the modified pro- 
gram. 

We use the impacted program behaviors to charac- 
terize the impacted parts of the SWHM system manu- 
ally. As part of our future work, we plan to automate 
this process. 

7 Conclusions and Future Work 

Software health management techniques monitor de- 
ployed software in its execution environment to detect 
violations, predict possible failures, and to help the sys- 
tem recover from faults. When the monitored software 
is changed, the SWHM monitor software may also need 
to change in order to continue to operate correctly. In 
this work we describe how the results of Directed In- 
cremental Symbolic Execution, a general change im- 
pact analysis technique we developed previously, can 
be used to maintain the correctness of a SWHM moni- 
tor when the monitored software is changed. To the best 
of our knowledge, existing software health management 
techniques have not addressed the issue of maintaining 
the correctness of the SWHM monitor over time as the 
monitored software evolves. 

The particular SWHM monitor software analyzed
in this work is based on Bayesian networks. Although
we have automated the analysis to compute the impact
of the changes on the monitored software, we have not
yet automated the process for updating the nodes in the
Bayesian network to indicate the impact of the changes.
For future work, we plan to automate this step and to
apply DiSE to larger programs to empirically evaluate
the effectiveness of this approach to maintaining the
health of the SWHM monitor software. We also plan to
explore how the results of DiSE can be used to support
other aspects of software health management, and to
apply DiSE results to other SWHM frameworks.

We believe that the core concept of DiSE can be
adapted and applied to other SWHM techniques as
well. For example, an approach has been described for the
formal verification of diagnosis systems using symbolic
model checking [26]. The diagnosis system observes a
physical system that is modeled as a Kripke structure.
The DiSE algorithm could be adapted to generate the
set of impacted behaviors on the Kripke structure. The
impacted behaviors can then be used to check the
correctness of the diagnosis system. Another model-based
prognostic technique uses simulations of a system
collected under nominal as well as degraded conditions
[22]. DiSE could be adapted to generate impacted
simulations of a system based on the changes. The impacted
simulations could then be used to help maintain the
prognostics system.